My Datascience Journey

    Data Visualization using Folium

    import folium
    from pyproj import CRS
    import geopandas as gpd
    import matplotlib.pyplot as plt

    Create a simple Interactive Map

    m = folium.Map(location=[20.59,78.96],zoom_start=5,control_scale=True)
    m

    Change the Style of the Map

    m = folium.Map(
        location = [20.59,78.96],
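        # Note: built-in Stamen tiles were dropped in newer folium releases (0.15+);
        # on those versions pick another built-in style such as "CartoDB positron".
        tiles = "Stamen Toner",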
        zoom_start = 5,
        control_scale = True,
        prefer_canvas = True
    )
    
    m

    Adding layers to the Map - Pin location

    # Pin location as (latitude, longitude): 18.5786832, 73.7666697
    m = folium.Map(location=[20.59, 78.96], zoom_start=5, control_scale=True)
    folium.Marker(
        location = [18.5786832,73.7666697],
        popup='Sai Eshnaya Apartments',
        icon = folium.Icon(color='green',icon='ok-sign')
    ).add_to(m)
    
    m

    Mark good residential areas in Pune

    points_fp = '../data/addresses.shp'
    points = gpd.read_file(points_fp)
    points.head()
         id  addr                                      geometry
    0  1000  Boat Club Road, 411001, Pune, Maharastra  POINT (73.87826 18.53937)
    1  1001  Koregaon, 415501, Pune, Maharastra        POINT (73.89299 18.53772)
    2  1002  Kothrud, 411038, Pune, Maharastra         POINT (73.80767 18.50389)
    3  1003  Balewadi, 411045, Pune, Maharastra        POINT (73.76912 18.57767)
    4  1004  Baner, 411047, Pune, Maharastra           POINT (73.77686 18.56424)
    points_gjson = folium.features.GeoJson(points, name='Good Residential Areas')
    m = folium.Map(location=[18.5786832,73.7666697], tiles="cartodbpositron",
                    zoom_start=8,
                    control_scale=True)
    points_gjson.add_to(m)
    folium.LayerControl().add_to(m)
    m
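The maps so far are centered on hard-coded coordinates. As a minimal sketch, the center could instead be derived from the points themselves by averaging their latitudes and longitudes. The helper name `map_center` is hypothetical, and the coordinates are copied from the `points.head()` output, so they cover only the first five rows of the shapefile.

```python
from statistics import mean

# (lat, lon) pairs copied from the points.head() output (first five rows only)
sample_locations = [
    (18.53937, 73.87826),  # Boat Club Road
    (18.53772, 73.89299),  # Koregaon
    (18.50389, 73.80767),  # Kothrud
    (18.57767, 73.76912),  # Balewadi
    (18.56424, 73.77686),  # Baner
]


def map_center(latlon_pairs):
    """Return [mean latitude, mean longitude] for a list of (lat, lon) pairs.

    Hypothetical helper for illustration; a plain average is adequate for
    points that sit close together, as these do.
    """
    lats, lons = zip(*latlon_pairs)
    return [mean(lats), mean(lons)]


center = map_center(sample_locations)
print(center)
```

The resulting list has the `[lat, lon]` shape that the `location` argument of `folium.Map` is given elsewhere on this page.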

    Create a Heatmap of the locations

    points["x"] = points["geometry"].apply(lambda geom: geom.x)
    points["y"] = points["geometry"].apply(lambda geom: geom.y)
    
    # Create a list of coordinate pairs
    locations = list(zip(points["y"], points["x"]))
    from folium.plugins import HeatMap
    
    # Create a Map instance
    # ("stamentoner" is only available in folium releases that still bundle Stamen tiles)
    m = folium.Map(
        location=[18.5786832, 73.7666697],
        tiles="stamentoner",
        zoom_start=10,
        control_scale=True,
    )
    
    # Add heatmap to map instance
    # Available parameters: HeatMap(data, name=None, min_opacity=0.5, max_zoom=18, max_val=1.0, radius=25, blur=15, gradient=None, overlay=True, control=True, show=True)
    HeatMap(locations).add_to(m)
    
    # Alternative syntax:
    # m.add_child(HeatMap(locations, radius=15))
    
    # Show map
    m
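One detail in the heatmap code is easy to miss: the geometry stores each point as (x, y), that is (longitude, latitude), while the `locations` list is built as (y, x) because folium takes latitude first. The sketch below shows the same swap on the WKT strings printed in the table earlier. The function `wkt_point_to_latlon` is a hypothetical helper written for this illustration, not part of folium or geopandas.

```python
import re

# Matches a 2D WKT point such as "POINT (73.87826 18.53937)"
_POINT_RE = re.compile(
    r"\s*POINT\s*\(\s*(-?\d+(?:\.\d+)?)\s+(-?\d+(?:\.\d+)?)\s*\)\s*"
)


def wkt_point_to_latlon(wkt):
    """Parse 'POINT (x y)' and return (lat, lon), i.e. (y, x)."""
    match = _POINT_RE.fullmatch(wkt)
    if match is None:
        raise ValueError(f"not a 2D WKT point: {wkt!r}")
    x, y = (float(value) for value in match.groups())
    return (y, x)  # folium expects latitude first


latlon = wkt_point_to_latlon("POINT (73.87826 18.53937)")
print(latlon)
```

Passing coordinates in (lon, lat) order is a common reason for markers or heat spots landing in the wrong part of the world.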

    Create a Clustered point Map

    from folium.plugins import MarkerCluster
    # Create a Map instance
    m = folium.Map(
        location=[18.5786832,73.7666697], tiles="cartodbpositron", zoom_start=12, control_scale=True
    )
    # Get x and y coordinates for each point
    points["x"] = points["geometry"].apply(lambda geom: geom.x)
    points["y"] = points["geometry"].apply(lambda geom: geom.y)
    
    # Create a list of coordinate pairs
    locations = list(zip(points["y"], points["x"]))
    # Create a folium marker cluster
    marker_cluster = MarkerCluster(locations)
    
    # Add marker cluster to map
    marker_cluster.add_to(m)
    
    # Show map
    m

    Create a Choropleth Map

    import geopandas as gpd
    from pyproj import CRS
    import requests
    import geojson
    
    # Specify the url for web feature service
    url = "https://kartta.hsy.fi/geoserver/wfs"
    
    # Specify parameters (read data in json format).
    # Available feature types in this particular data source: http://geo.stat.fi/geoserver/vaestoruutu/wfs?service=wfs&version=2.0.0&request=describeFeatureType
    params = dict(
        service="WFS",
        version="2.0.0",
        request="GetFeature",
        typeName="asuminen_ja_maankaytto:Vaestotietoruudukko_2018",
        outputFormat="json",
    )
    
    # Fetch data from WFS using requests
    r = requests.get(url, params=params)
    
    # Create GeoDataFrame from geojson
    data = gpd.GeoDataFrame.from_features(geojson.loads(r.content))
    
    # Check the data
    data.head()
       geometry                                           index  asukkaita  asvaljyys  ika0_9  ika10_19  ika20_29  ika30_39  ika40_49  ika50_59  ika60_69  ika70_79  ika_yli80
    0  POLYGON ((25472499.995 6689749.005, 25472499.9...    688          9       28.0      99        99        99        99        99        99        99        99         99
    1  POLYGON ((25472499.995 6685998.998, 25472499.9...    703          5       51.0      99        99        99        99        99        99        99        99         99
    2  POLYGON ((25472499.995 6684249.004, 25472499.9...    710          8       44.0      99        99        99        99        99        99        99        99         99
    3  POLYGON ((25472499.995 6683999.005, 25472499.9...    711          5       90.0      99        99        99        99        99        99        99        99         99
    4  POLYGON ((25472499.995 6682998.998, 25472499.9...    715         11       41.0      99        99        99        99        99        99        99        99         99
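To see roughly what request the `params` dictionary turns into, the query string can be assembled with the standard library. This is only a sketch for inspection: `requests` builds its own URL internally, and the exact percent-encoding it produces may differ slightly from `urlencode`.

```python
from urllib.parse import urlencode

# Same endpoint and parameters as in the WFS request
url = "https://kartta.hsy.fi/geoserver/wfs"
params = dict(
    service="WFS",
    version="2.0.0",
    request="GetFeature",
    typeName="asuminen_ja_maankaytto:Vaestotietoruudukko_2018",
    outputFormat="json",
)

# Join the endpoint and the encoded parameters into one GET URL
full_url = f"{url}?{urlencode(params)}"
print(full_url)
```

Pasting the printed URL into a browser is a quick way to check whether the service still offers the layer before debugging the GeoDataFrame step.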
    from pyproj import CRS
    
    # Define crs
    data.crs = CRS.from_epsg(3879)
    # Re-project to WGS84
    data = data.to_crs(epsg=4326)
    
    # Check layer crs definition
    print(data.crs)
    EPSG:4326
    # Change the name of a column
    data = data.rename(columns={"asukkaita": "pop18"})
    data["geoid"] = data.index.astype(str)
    # Select only needed columns
    data = data[["geoid", "pop18", "geometry"]]
    
    # Convert to geojson (not needed for the simple choropleth map!)
    # pop_json = data.to_json()
    
    # check data
    data.head()
       geoid  pop18  geometry
    0      0      9  POLYGON ((24.50236 60.31928, 24.50233 60.32152...
    1      1      5  POLYGON ((24.50287 60.28562, 24.50284 60.28787...
    2      2      8  POLYGON ((24.50311 60.26992, 24.50308 60.27216...
    3      3      5  POLYGON ((24.50315 60.26767, 24.50311 60.26992...
    4      4     11  POLYGON ((24.50328 60.25870, 24.50325 60.26094...
    m = folium.Map(
        location=[60.25, 24.8], tiles="cartodbpositron", zoom_start=10, control_scale=True
    )
    
    # Plot a choropleth map
    # Note: in `columns`, the key column ('geoid', created earlier) must come first,
    # followed by the column holding the values to plot
    folium.Choropleth(
        geo_data=data,
        name="Population in 2018",
        data=data,
        columns=["geoid", "pop18"],
        key_on="feature.id",
        fill_color="YlOrRd",
        fill_opacity=0.7,
        line_opacity=0.2,
        line_color="white",
        line_weight=0,
        highlight=False,
        smooth_factor=1.0,
        # threshold_scale=[100, 250, 500, 1000, 2000],
        legend_name="Population in Helsinki",
    ).add_to(m)
    
    # Show map
    m
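The `key_on="feature.id"` and `columns=["geoid", "pop18"]` arguments describe a join: each polygon's id is looked up among the keys in the first column, and the matching value from the second column decides its color. The sketch below mimics that lookup in plain Python. It is not folium's actual implementation; the helper `join_values` and the feature with id "99" are made up for illustration.

```python
# Minimal GeoJSON-like structure; geometries are omitted (null) for brevity
geojson_like = {
    "type": "FeatureCollection",
    "features": [
        {"type": "Feature", "id": "0", "properties": {}, "geometry": None},
        {"type": "Feature", "id": "1", "properties": {}, "geometry": None},
        {"type": "Feature", "id": "2", "properties": {}, "geometry": None},
        # Hypothetical feature with no matching data row
        {"type": "Feature", "id": "99", "properties": {}, "geometry": None},
    ],
}

# (geoid, pop18) pairs taken from the first rows of data.head()
rows = [("0", 9), ("1", 5), ("2", 8)]


def join_values(geojson, key_value_rows):
    """Map each feature id to its value, or None when no row matches."""
    lookup = dict(key_value_rows)
    return {
        feature["id"]: lookup.get(feature["id"])
        for feature in geojson["features"]
    }


joined = join_values(geojson_like, rows)
print(joined)
```

This is why `geoid` is created as a string copy of the index: the keys on both sides have to compare equal for the join to find a match.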

    Create Choropleth Map with Interaction

    # Add a transparent GeoJson layer over the choropleth so that hovering
    # over a polygon shows its population in a tooltip
    folium.features.GeoJson(
        data,
        name="Labels",
        style_function=lambda x: {
            "color": "transparent",
            "fillColor": "transparent",
            "weight": 0,
        },
        tooltip=folium.features.GeoJsonTooltip(
            fields=["pop18"], aliases=["Population"], labels=True, sticky=False
        ),
    ).add_to(m)
    
    m
    email: tulasiram.gunipati@gmail.com